WHAT IS CLAIMED IS:

1. A gene expression state estimating system for estimating the
probability of gene expression in each channel, the system including an input 
device for sending gene expression level data, a program-controlled data 
analyzer, and an output device, wherein said data analyzer comprises 

distributed parameter estimating means for estimating distributed 
parameters of a mixed normal distribution shown in the following equation (25) 
using the gene expression level data from said input device, and sending the 
estimated distributed parameters: 

$$(1 - \xi)\,\varphi(u - \mu_0 \mid \sigma_0^2) + \xi\,\varphi(u - \mu_1 \mid \sigma_1^2) \tag{25}$$
where $\varphi(\cdot \mid \sigma^2)$ represents the density function of a one-dimensional normal
distribution with average 0 and variance $\sigma^2$, $(\mu_0, \sigma_0^2)$ and $(\mu_1, \sigma_1^2)$ are
average and variance parameters of first and second components,
respectively, and $\xi$ is the mixing ratio, with the assumption that
$\mu_0 < \mu_1$, $\sigma_0 > 0$, $\sigma_1 > 0$, $0 < \xi < 1$ is satisfied,

mixing ratio parameter estimating means for estimating a mixing ratio 
parameter of the mixed normal distribution using the gene expression level data 
sent from said input device and the distributed parameters sent from said 
distributed parameter estimating means, and sending the estimated mixing ratio 
parameter, and 

posterior probability calculating means for calculating the posterior 
probability of the expression state of each gene in each channel using the gene 
expression level data, the estimated distributed parameters and mixing ratio 
parameter, and sending the calculated posterior probability to said output 
device. 
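
As an illustration only, the two-component density of equation (25) can be evaluated as in the following Python sketch. The function names `normal_pdf` and `mixture_pdf` are hypothetical and are not part of the claims; the sketch assumes the variances are passed directly rather than the standard deviations.

```python
import math

def normal_pdf(x, var):
    """Density phi(x | var) of a zero-mean normal distribution with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(u, xi, mu0, var0, mu1, var1):
    """Equation (25): (1 - xi) * phi(u - mu0 | var0) + xi * phi(u - mu1 | var1)."""
    # Constraints stated alongside equation (25).
    if not (mu0 < mu1 and var0 > 0 and var1 > 0 and 0 < xi < 1):
        raise ValueError("parameters violate the constraints of equation (25)")
    return (1.0 - xi) * normal_pdf(u - mu0, var0) + xi * normal_pdf(u - mu1, var1)
```

The first component models the lower-mean population and the second the higher-mean population, with `xi` weighting the second.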

2. The system according to claim 1 wherein said distributed parameter 
estimating means estimates the mixing ratio ($\xi$), average ($\mu_0$, $\mu_1$), and variance
($\sigma_0^2$, $\sigma_1^2$) by applying the mixed normal distribution of two components to data
on the sum of the amounts of expression of genes located in a region where the 
difference of gene expression levels X and Y of two channels is near 0. 



3. The system according to claim 2 wherein when the median value of
the absolute difference $|v_i|$ $(i = 1, \ldots, n)$ of the gene expression levels X
and Y is $c_M$, the data on the amounts of gene expression is shown by
$\{u_i \mid |v_i| \le c_M,\ i = 1, \ldots, n\}$.
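
The selection and fitting described in claims 2 and 3 might be sketched as follows. This is a minimal sketch under stated assumptions: the claims do not name a fitting algorithm, so expectation-maximization (EM) is used here as one common choice, and the names `select_near_zero` and `fit_two_component_mixture` are hypothetical. The selection rule keeps the values whose absolute difference does not exceed the median.

```python
import math
import statistics

def select_near_zero(u, v):
    """Return the u_i whose |v_i| is at most c_M, the median of all |v_i|."""
    c_m = statistics.median([abs(x) for x in v])
    return [ui for ui, vi in zip(u, v) if abs(vi) <= c_m]

def _normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_component_mixture(data, n_iter=200, min_var=1e-6):
    """Fit (xi, mu0, var0, mu1, var1) by EM (an assumed method); needs >= 2 values."""
    xs = sorted(data)
    half = len(xs) // 2
    low, high = xs[:half], xs[half:]
    # Initialise each component from one half of the sorted data.
    mu0, mu1 = statistics.mean(low), statistics.mean(high)
    var0 = max(statistics.pvariance(low), min_var)
    var1 = max(statistics.pvariance(high), min_var)
    xi = 0.5
    for _ in range(n_iter):
        # E-step: responsibility of the second component for each value.
        resp = []
        for x in xs:
            a = (1.0 - xi) * _normal_pdf(x, mu0, var0)
            b = xi * _normal_pdf(x, mu1, var1)
            resp.append(b / (a + b) if a + b > 0.0 else 0.5)
        n1 = sum(resp)
        n0 = len(xs) - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break  # one component has collapsed; keep the last estimate
        # M-step: weighted means, variances, and mixing ratio.
        mu0 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n0
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        var0 = max(sum((1.0 - r) * (x - mu0) ** 2 for r, x in zip(resp, xs)) / n0, min_var)
        var1 = max(sum(r * (x - mu1) ** 2 for r, x in zip(resp, xs)) / n1, min_var)
        xi = n1 / len(xs)
    return xi, mu0, var0, mu1, var1
```

A typical call would be `fit_two_component_mixture(select_near_zero(u, v))`, where `u` and `v` hold the per-gene sums and differences of the two channels.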



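4. The system according to claim 3 wherein said distributed parameter
estimating means performs estimation by the use of the estimated
$\hat{\xi}, \hat{\mu}_0, \hat{\sigma}_0^2, \hat{\mu}_1, \hat{\sigma}_1^2$ according to the following equations (26), (27), (28), and (29)
to estimate $\mu, \sigma_e^2, \sigma_\beta^2, \lambda$;

$$\hat{\mu} = (\hat{\mu}_1 - \hat{\mu}_0) / 2 \tag{26}$$

$$\hat{\sigma}_e^2 = \frac{1}{2\,\|N_0\|} \sum_{i \in N_0} v_i^2 \tag{27}$$

$$\hat{\sigma}_\beta^2 = \frac{1}{4}\hat{\sigma}_0^2 - \frac{1}{2}\hat{\sigma}_e^2 \tag{28}$$

$$\hat{\lambda} = \sqrt{\log\left(1 + \frac{\hat{\sigma}_1^2 - \hat{\sigma}_0^2}{4\hat{\mu}^2}\right)} \tag{29}$$

where $N_0$ denotes an index set of data values that satisfies
$i \in \{i \mid u_i < \hat{\mu}_0\}$ and $\|N_0\|$ denotes the number of elements.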
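
The derived estimates of claim 4 could be computed along the following lines. This sketch assumes the following readings of equations (26) to (29): the mean is half the gap between the component means, the error variance is half the mean squared difference over the index set N_0, the second variance is one quarter of the first-component variance minus half the error variance, and lambda is the square root of log(1 + (var1 - var0) / (4 mu^2)). The summand in equation (27) is not legible in the source, so the squared difference used here is an assumption inferred from the variance terms in claim 5. The function name `derive_parameters` is hypothetical.

```python
import math

def derive_parameters(mu0, var0, mu1, var1, u, v):
    """Return (mu, var_e, var_beta, lam) from the fitted mixture parameters."""
    mu = (mu1 - mu0) / 2.0                                   # equation (26)
    n0 = [i for i, ui in enumerate(u) if ui < mu0]           # index set N_0
    if not n0:
        raise ValueError("no data value lies below mu0, so N_0 is empty")
    # Equation (27); the squared-difference summand is an assumed reading.
    var_e = sum(v[i] ** 2 for i in n0) / (2.0 * len(n0))
    var_beta = var0 / 4.0 - var_e / 2.0                      # equation (28)
    if var1 <= var0 or mu == 0.0:
        raise ValueError("equation (29) needs var1 > var0 and mu != 0")
    lam = math.sqrt(math.log(1.0 + (var1 - var0) / (4.0 * mu * mu)))  # equation (29)
    return mu, var_e, var_beta, lam
```

Note that equation (28) can yield a negative value on noisy data; a practical implementation would likely need to guard against that case, which the claims do not address.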



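5. The system according to claim 4 wherein said mixing ratio
parameter estimating means estimates the mixing ratio parameter
$p = (p_{00}, p_{10}, p_{01}, p_{11})$ (where $p_{00}$ denotes a mixing ratio when no gene is being expressed in
both cells 1 and 2, $p_{11}$ denotes a mixing ratio when any gene is being expressed
in both cells 1 and 2, $p_{10}$ denotes a mixing ratio when a gene is being expressed
in cell 1 but not in cell 2, and $p_{01}$ represents a mixing ratio when a gene is being
expressed in cell 2 but not in cell 1) using $\hat{\theta} = (\hat{\mu}, \hat{\lambda}, \hat{\sigma}_e^2, \hat{\sigma}_\beta^2)$ given from said
distributed parameter estimating means by applying a two-variable mixed
normal distribution of four components shown in the following equation (30) to
the gene expression level data $\{(u_i, v_i) \mid i = 1, \ldots, n\}$ sent from said input device

$$\begin{aligned}
& p_{00}\, g_{00}(u, v \mid \hat{\theta}) + p_{10}\, g_{10}(u, v \mid \hat{\theta}) + p_{01}\, g_{01}(u, v \mid \hat{\theta}) + p_{11}\, g_{11}(u, v \mid \hat{\theta}) \\
&\quad = p_{00}\, \varphi(u \mid 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2)\, \varphi(v \mid 2\hat{\sigma}_e^2) + p_{10}\, \varphi_2(u - \hat{\mu},\, v - \hat{\mu} \mid \Sigma_{10}) \\
&\qquad + p_{01}\, \varphi_2(u - \hat{\mu},\, v + \hat{\mu} \mid \Sigma_{01}) + p_{11}\, \varphi(u - 2\hat{\mu} \mid 4\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2)\, \varphi(v \mid 2\hat{\sigma}_e^2)
\end{aligned} \tag{30}$$

where the above equation satisfies the relationships shown in the following
equations (31) and (32).

$$\Sigma_{10} = \begin{pmatrix}
\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2 & \hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) \\
\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) & \hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 2\hat{\sigma}_e^2
\end{pmatrix} \tag{31}$$

$$\Sigma_{01} = \begin{pmatrix}
\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2 & -\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) \\
-\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) & \hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 2\hat{\sigma}_e^2
\end{pmatrix} \tag{32}$$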
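
One way to evaluate the four component densities of equation (30) and to estimate the mixing ratios is sketched below. It rests on several assumptions: the component forms follow the reading of equations (30) to (32) in which the covariance term is mu^2 (e^(lambda^2) - 1), the claims do not state how the mixing ratios are fitted, so EM with the component densities held fixed is used as one plausible choice, and the function names are hypothetical.

```python
import math

def phi(x, var):
    """Zero-mean univariate normal density with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def phi2(x, y, s11, s12, s22):
    """Zero-mean bivariate normal density with covariance [[s11, s12], [s12, s22]]."""
    det = s11 * s22 - s12 * s12
    q = (s22 * x * x - 2.0 * s12 * x * y + s11 * y * y) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def component_densities(u, v, mu, lam, var_e, var_beta):
    """Return {(j, k): g_jk(u, v | theta)} for the four expression states."""
    w = mu * mu * (math.exp(lam * lam) - 1.0)   # shared term of equations (31), (32)
    base = 4.0 * var_beta + 2.0 * var_e
    return {
        (0, 0): phi(u, base) * phi(v, 2.0 * var_e),
        (1, 0): phi2(u - mu, v - mu, w + base, w, w + 2.0 * var_e),
        (0, 1): phi2(u - mu, v + mu, w + base, -w, w + 2.0 * var_e),
        (1, 1): phi(u - 2.0 * mu, 4.0 * w + base) * phi(v, 2.0 * var_e),
    }

def estimate_mixing_ratios(pairs, theta, n_iter=100):
    """EM over the mixing ratios only (assumed method); theta = (mu, lam, var_e, var_beta)."""
    keys = [(0, 0), (1, 0), (0, 1), (1, 1)]
    p = {k: 0.25 for k in keys}
    dens = [component_densities(u, v, *theta) for u, v in pairs]
    for _ in range(n_iter):
        totals = {k: 0.0 for k in keys}
        for g in dens:
            f = sum(p[k] * g[k] for k in keys)
            if f <= 0.0:
                continue  # skip pairs with vanishing density under every component
            for k in keys:
                totals[k] += p[k] * g[k] / f
        s = sum(totals.values())
        if s == 0.0:
            break
        p = {k: totals[k] / s for k in keys}
    return p
```

Holding the component parameters fixed keeps each EM step a simple averaging of responsibilities, which matches the two-stage structure of the claims (distribution parameters first, mixing ratios second).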



6. The system according to claim 5 wherein said posterior probability
calculating means calculates the posterior probability of expression of any gene
in cell 1 and cell 2 for each pair (u, v) of the gene expression level data sent
from said input device according to the following equation (33) (where
$f(u, v \mid \hat{p}, \hat{\theta})$ is given by the following equation (34)) and the following
equation (35) (where $\tau_1$, $\tau_2$ take either 1 or 0, which represents the presence or
absence of true gene expression in each cell).

$$\Pr(\tau_1 = 1 \mid \hat{p}, \hat{\theta}) = \frac{\hat{p}_{10}\, g_{10}(u, v \mid \hat{\theta}) + \hat{p}_{11}\, g_{11}(u, v \mid \hat{\theta})}{f(u, v \mid \hat{p}, \hat{\theta})} \tag{33}$$

$$f(u, v \mid \hat{p}, \hat{\theta}) = \sum_{(j,k) \in \{0,1\}^2} \hat{p}_{jk}\, g_{jk}(u, v \mid \hat{\theta}) \tag{34}$$

$$\Pr(\tau_2 = 1 \mid \hat{p}, \hat{\theta}) = \frac{\hat{p}_{01}\, g_{01}(u, v \mid \hat{\theta}) + \hat{p}_{11}\, g_{11}(u, v \mid \hat{\theta})}{f(u, v \mid \hat{p}, \hat{\theta})} \tag{35}$$
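
The posterior probabilities of equations (33) and (35) reduce to ratios of weighted component densities, as in this short sketch. It assumes the mixing ratios and the component densities for one pair (u, v) are supplied as dictionaries keyed by the state (j, k); that data layout and the name `posterior_probabilities` are illustrative choices, not taken from the claims.

```python
def posterior_probabilities(p, g):
    """Return (Pr(tau_1 = 1), Pr(tau_2 = 1)) for one pair (u, v).

    p: mixing ratios keyed by (j, k); g: component densities keyed by (j, k).
    """
    f = sum(p[k] * g[k] for k in p)   # equation (34): total mixture density
    if f <= 0.0:
        raise ValueError("mixture density is zero for this pair")
    pr_cell1 = (p[(1, 0)] * g[(1, 0)] + p[(1, 1)] * g[(1, 1)]) / f   # equation (33)
    pr_cell2 = (p[(0, 1)] * g[(0, 1)] + p[(1, 1)] * g[(1, 1)]) / f   # equation (35)
    return pr_cell1, pr_cell2
```

Applying the function to every pair in turn gives one posterior per gene and per cell.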



7. The system according to claim 5 wherein said posterior probability 
calculating means calculates the posterior probability indicating an event of 
differential expression between cell 1 and cell 2 according to the following 
equation (36)

$$\Pr(\tau_1 \ne \tau_2 \mid \hat{p}, \hat{\theta}) = \frac{\hat{p}_{10}\, g_{10}(u, v \mid \hat{\theta}) + \hat{p}_{01}\, g_{01}(u, v \mid \hat{\theta})}{f(u, v \mid \hat{p}, \hat{\theta})} \tag{36}$$

where $\tau_1$, $\tau_2$ take either 1 or 0, which represents the presence or absence of
true gene expression in each cell.

8. The system according to claim 6 or 7 wherein said posterior
probability calculating means judges whether calculations of posterior
probabilities of gene expression have been made for all the pairs (u, v) of the
gene expression level data, and when all the calculations have been completed,
it ends the process, while when all the calculations have not been completed
yet, it calculates the posterior probability related to the next gene, such that the
calculated posterior probabilities of gene expression in each channel are sent to
said output device, and

said output device displays the posterior probabilities of gene
expression in each channel.

9. A gene expression state estimating method for estimating the
probability of gene expression in each channel based on gene expression level
data, comprising the steps of:

estimating distributed parameters of a mixed normal distribution shown
in the following equation (37) using the gene expression level data and sending
the estimated distributed parameters:

$$(1 - \xi)\,\varphi(u - \mu_0 \mid \sigma_0^2) + \xi\,\varphi(u - \mu_1 \mid \sigma_1^2) \tag{37}$$

where $\varphi(\cdot \mid \sigma^2)$ represents the density function of a one-dimensional normal
distribution with average 0 and variance $\sigma^2$, $(\mu_0, \sigma_0^2)$ and $(\mu_1, \sigma_1^2)$ are
average and variance parameters of first and second components,
respectively, and $\xi$ is the mixing ratio, with the assumption that
$\mu_0 < \mu_1$, $\sigma_0 > 0$, $\sigma_1 > 0$, $0 < \xi < 1$ is satisfied,

estimating a mixing ratio parameter of the mixed normal distribution 
using the gene expression level data and the estimated distributed parameters, 
and sending the estimated mixing ratio parameter, and 

calculating the posterior probability of the expression state of each gene 
in each channel using the gene expression level data, the estimated distributed 
parameters, and the estimated mixing ratio parameter, and sending the 
calculated posterior probability. 

10. The method according to claim 9 wherein said step of estimating
the distributed parameters further comprises a step of estimating the mixing
ratio ($\xi$), average ($\mu_0$, $\mu_1$), and variance ($\sigma_0^2$, $\sigma_1^2$) by applying the mixed normal
distribution of two components to data on the sum of the amounts of expression
of genes located in a region where the difference of gene expression levels X
and Y of two channels is near 0.

11. The method according to claim 10 wherein when the median value
of the absolute difference $|v_i|$ $(i = 1, \ldots, n)$ of the gene expression levels
X and Y is $c_M$, the data on the sum of the amounts of expression of genes is
shown by $\{u_i \mid |v_i| \le c_M,\ i = 1, \ldots, n\}$.



12. The method according to claim 11 wherein the estimation is
performed in said step of estimating the distributed parameters by the use of the
estimated $\hat{\xi}, \hat{\mu}_0, \hat{\sigma}_0^2, \hat{\mu}_1, \hat{\sigma}_1^2$ according to the following equations (38), (39),
(40), and (41) to estimate $\mu, \sigma_e^2, \sigma_\beta^2, \lambda$;

$$\hat{\mu} = (\hat{\mu}_1 - \hat{\mu}_0) / 2 \tag{38}$$

$$\hat{\sigma}_e^2 = \frac{1}{2\,\|N_0\|} \sum_{i \in N_0} v_i^2 \tag{39}$$

$$\hat{\sigma}_\beta^2 = \frac{1}{4}\hat{\sigma}_0^2 - \frac{1}{2}\hat{\sigma}_e^2 \tag{40}$$

$$\hat{\lambda} = \sqrt{\log\left(1 + \frac{\hat{\sigma}_1^2 - \hat{\sigma}_0^2}{4\hat{\mu}^2}\right)} \tag{41}$$

where $N_0$ denotes an index set of data values that satisfies
$i \in \{i \mid u_i < \hat{\mu}_0\}$ and $\|N_0\|$ denotes the number of elements.



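13. The method according to claim 12 wherein the estimation is
performed in said step of estimating the mixing ratio parameter
$p = (p_{00}, p_{10}, p_{01}, p_{11})$ (where $p_{00}$ denotes a mixing ratio when no gene is being expressed in both
cells 1 and 2, $p_{11}$ denotes a mixing ratio when any gene is being expressed in
both cells 1 and 2, $p_{10}$ denotes a mixing ratio when a gene is being expressed in
cell 1 but not in cell 2, and $p_{01}$ represents a mixing ratio when a gene is being
expressed in cell 2 but not in cell 1) using $\hat{\theta} = (\hat{\mu}, \hat{\lambda}, \hat{\sigma}_e^2, \hat{\sigma}_\beta^2)$ sent from said
step of estimating the distributed parameters by applying a two-variable mixed
normal distribution of four components shown in the following equation (42) to
the sent gene expression level data $\{(u_i, v_i) \mid i = 1, \ldots, n\}$

$$\begin{aligned}
& p_{00}\, g_{00}(u, v \mid \hat{\theta}) + p_{10}\, g_{10}(u, v \mid \hat{\theta}) + p_{01}\, g_{01}(u, v \mid \hat{\theta}) + p_{11}\, g_{11}(u, v \mid \hat{\theta}) \\
&\quad = p_{00}\, \varphi(u \mid 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2)\, \varphi(v \mid 2\hat{\sigma}_e^2) + p_{10}\, \varphi_2(u - \hat{\mu},\, v - \hat{\mu} \mid \Sigma_{10}) \\
&\qquad + p_{01}\, \varphi_2(u - \hat{\mu},\, v + \hat{\mu} \mid \Sigma_{01}) + p_{11}\, \varphi(u - 2\hat{\mu} \mid 4\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2)\, \varphi(v \mid 2\hat{\sigma}_e^2)
\end{aligned} \tag{42}$$

where the above equation satisfies the relationships shown in the following
equations (43) and (44).

$$\Sigma_{10} = \begin{pmatrix}
\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2 & \hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) \\
\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) & \hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 2\hat{\sigma}_e^2
\end{pmatrix} \tag{43}$$

$$\Sigma_{01} = \begin{pmatrix}
\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 4\hat{\sigma}_\beta^2 + 2\hat{\sigma}_e^2 & -\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) \\
-\hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) & \hat{\mu}^2 (e^{\hat{\lambda}^2} - 1) + 2\hat{\sigma}_e^2
\end{pmatrix} \tag{44}$$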



14. The method according to claim 13 wherein the calculation of the
posterior probability of expression of any gene in cell 1 and cell 2 is made in
said step of calculating the posterior probability for each pair (u, v) of the sent
gene expression level data according to the following equation (45) (where
$f(u, v \mid \hat{p}, \hat{\theta})$ is given by the following equation (46)) and the following
equation (47) (where $\tau_1$, $\tau_2$ take either 1 or 0, which represents the presence or
absence of true gene expression in each cell).

$$\Pr(\tau_1 = 1 \mid \hat{p}, \hat{\theta}) = \frac{\hat{p}_{10}\, g_{10}(u, v \mid \hat{\theta}) + \hat{p}_{11}\, g_{11}(u, v \mid \hat{\theta})}{f(u, v \mid \hat{p}, \hat{\theta})} \tag{45}$$

$$f(u, v \mid \hat{p}, \hat{\theta}) = \sum_{(j,k) \in \{0,1\}^2} \hat{p}_{jk}\, g_{jk}(u, v \mid \hat{\theta}) \tag{46}$$

$$\Pr(\tau_2 = 1 \mid \hat{p}, \hat{\theta}) = \frac{\hat{p}_{01}\, g_{01}(u, v \mid \hat{\theta}) + \hat{p}_{11}\, g_{11}(u, v \mid \hat{\theta})}{f(u, v \mid \hat{p}, \hat{\theta})} \tag{47}$$



15. The method according to claim 13 wherein the calculation of the
posterior probability indicating an event of differential expression between cell 1
and cell 2 is made in said step of calculating the posterior probability according
to the following equation (48)

$$\Pr(\tau_1 \ne \tau_2 \mid \hat{p}, \hat{\theta}) = \frac{\hat{p}_{10}\, g_{10}(u, v \mid \hat{\theta}) + \hat{p}_{01}\, g_{01}(u, v \mid \hat{\theta})}{f(u, v \mid \hat{p}, \hat{\theta})} \tag{48}$$

where $\tau_1$, $\tau_2$ take either 1 or 0, which represents the presence or absence of
true gene expression in each cell.

16. The method according to claim 14 or 15 wherein it is judged in
said step of calculating the posterior probability whether calculations of posterior 
probabilities of gene expression have been made for all the pairs (u, v) of the 
gene expression level data, and when all the calculations have been completed, 
the process is ended, while when all the calculations have not been completed 
yet, the posterior probability related to the next gene is calculated. 

17. A gene expression state estimating program for estimating the 
probability of gene expression in each channel based on gene expression level 
data, the program instructing a computer to execute the steps of: 

estimating distributed parameters of a mixed normal distribution shown 
in the following equation (49) using the gene expression level data and sending 
the estimated distributed parameters: 

$$(1 - \xi)\,\varphi(u - \mu_0 \mid \sigma_0^2) + \xi\,\varphi(u - \mu_1 \mid \sigma_1^2) \tag{49}$$

where $\varphi(\cdot \mid \sigma^2)$ represents the density function of a one-dimensional normal
distribution with average 0 and variance $\sigma^2$, $(\mu_0, \sigma_0^2)$ and $(\mu_1, \sigma_1^2)$ are
average and variance parameters of first and second components,
respectively, and $\xi$ is the mixing ratio, with the assumption that
$\mu_0 < \mu_1$, $\sigma_0 > 0$, $\sigma_1 > 0$, $0 < \xi < 1$ is satisfied,

estimating a mixing ratio parameter of the mixed normal distribution 
using the gene expression level data and the estimated distributed parameters, 
and sending the estimated mixing ratio parameter, and 

calculating the posterior probability of the expression state of each gene 
in each channel using the gene expression level data, the estimated distributed 
parameters, and the estimated mixing ratio parameter, and sending the 
calculated posterior probability.

18. A computer-readable recording medium storing a gene expression
state estimating program for estimating the probability of gene expression in 
each channel based on gene expression level data, the program instructing a 
computer to execute the steps of: 

estimating distributed parameters of a mixed normal distribution shown 
in the following equation (50) using the gene expression level data and sending 
the estimated distributed parameters: 

$$(1 - \xi)\,\varphi(u - \mu_0 \mid \sigma_0^2) + \xi\,\varphi(u - \mu_1 \mid \sigma_1^2) \tag{50}$$

where $\varphi(\cdot \mid \sigma^2)$ represents the density function of a one-dimensional normal
distribution with average 0 and variance $\sigma^2$, $(\mu_0, \sigma_0^2)$ and $(\mu_1, \sigma_1^2)$ are
average and variance parameters of first and second components,
respectively, and $\xi$ is the mixing ratio, with the assumption that

$\mu_0 < \mu_1$, $\sigma_0 > 0$, $\sigma_1 > 0$, $0 < \xi < 1$ is satisfied,

estimating a mixing ratio parameter of the mixed normal distribution 
using the gene expression level data and the estimated distributed parameters, 
and sending the estimated mixing ratio parameter, 

calculating the posterior probability of the expression state of each gene 
in each channel using the gene expression level data, the estimated distributed 
parameters, and the estimated mixing ratio parameter, and sending the 
calculated posterior probability, and 

outputting the calculated posterior probability. 



